-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL] [AMDGPU] Ignore incorrect sub-group size #11687
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
CDNA supports only 64 wave front size, for those GPUs set subgroup size to 64. Some GPUS support both 32 and 64, for those (and the rest) only allow 32.
8c22cd8
to
ac489a5
Compare
Friendly ping @intel/dpcpp-cfe-reviewers |
Added a test to make sure that the values are indeed ignored: 0a72b7b |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Test failure: |
That's right, this patch changes the handling of incorrect sub-group attribute, such that it doesn't end up in the resulting binary image. There are two ways of thinking about that issue:
I'm not sure which is the correct way of handling those values. |
I don't think it makes sense allowing an invalid subgroup size to pass through if we're emitting the warning. IMO we should either just allow the incorrect value to pass through without a warning and then have the backend deal with it however it is doing so currently, or we do what this PR does - i.e. emit warning and drop the attribute. @AlexeySachkov can you weigh in here? Is there a reason we are passing though invalid subgroup size? |
Any preference on this @AlexeySachkov ? |
@AlexeySachkov friendly ping on this one please. |
From SYCL spec point of view, there are no compile-time known incorrect sub-group sizes - what we have for AMD and CUDA are implementation details of those backends. Explicitly requesting sub-group size of a kernel has limited portability by design, but considering how fundamental and narrow the selection of possible sub-group sizes for AMD/CUDA, it might make sense to promote that knowledge into the compiler in form of the suggested warning to improve user experience. I think that it may actually make sense to pass the invalid sizes through even if we diagnosed them with a warning: this way runtime handling of the attribute for AMD/CUDA backend would be uniform with other backends and users will get more consistent experience. At the same time the warning will let users know in advance that their application won't work and it requires some code changes. The only concern about the warning I have is that it may produce false alarms: what if there is a kernel which is never submitted to a AMD/CUDA device? The warning will still be there and it may be annoying for someone who uses |
Following you suggestion, I've let the invalid value pass through. I've removed a test, since no values are being ignored.
I feel that it's an edge case, that is outweighed by the benefit of informing the users about the incorrect size, and perhaps we could ignore it for now. If that ever becomes an issue, it would be simple enough to provide a mechanism to silence this particular warning? |
I think that we may encounter that sooner than later, but I do agree that benefit of the warning probably outweighs the |
As a person compiling with It looks like currently the required error suppression would be too broad. Originally (#6183), |
CDNA supports only 64 wave front size, for those GPUs allow subgroup size of 64. Some GPUs support both 32 and 64, for those (and the rest) only allow 32.